A GENERALIZED CHARACTERIZATION
OF ALGORITHMIC PROBABILITY 

TOM F. STERKENBURG 


Abstract. An a priori semimeasure (also known as “algorithmic probability” 
or “the Solomonoff prior” in the context of inductive inference) is defined as the 
transformation, by a given universal monotone Turing machine, of the uniform 
measure on the infinite strings. It is shown in this paper that the class of a 
priori semimeasures can equivalently be defined as the class of transformations, 
by all compatible universal monotone Turing machines, of any continuous
computable measure in place of the uniform measure. Some consideration is given
to possible implications for the prevalent association of algorithmic probability 
with certain foundational statistical principles. 


1. Introduction 

Levin [23] first considered the transformation of the uniform measure $\lambda$ on the
infinite bit strings by a universal monotone machine $U$. This transformation $\lambda_U$ is
the function that for each finite bit string returns the probability that the string is
generated by machine $U$, when $U$ is supplied a stream of uniformly random input
(produced by tossing a fair coin, say). Levin attached to $\lambda_U$ the interpretation of
an “a priori probability” distribution, because $\lambda_U$ dominates every other
semicomputable semimeasure and so the initial assumption that a sequence is randomly
generated from $\lambda_U$ is in an exact sense the weakest of randomness assumptions.

Earlier on, Solomonoff [20] described in a somewhat less precise way a very 
similar definition. His motivation was an “a priori probability” distribution to serve 
as an objective starting point in inductive inference. In this context the definition 
is known under various headers, including “the Solomonoff prior” and “algorithmic 
probability;” and it has been associated with certain foundational principles from 
statistics, to explain or support its merits as an idealized inductive method. 

As commonly presented, however, the association with two main such principles
(firstly, the principle of indifference, and secondly, the principle of Occam’s razor)
seems to essentially rest on the definition of $\lambda_U$ as a universal transformation of the
uniform measure $\lambda$.

This raises the question whether the a priori semimeasures (as we will call the
functions $\lambda_U$ here) must be defined, as they always are, as the universal
transformations of the uniform measure, or whether the a priori semimeasures can
equivalently be defined as universal transformations of other computable measures.

The main result of this paper is that any a priori semimeasure can indeed be
obtained as a universal transformation of any continuous computable measure. That
is, for any continuous computable measure, an a priori semimeasure can
equivalently be defined as giving the probabilities for finite strings being generated by a
universal machine that is presented with a stream of bits sampled from this
measure. More precisely, for any continuous computable measure $\mu$, it is shown that
the class of functions $\lambda_U$ for all universal monotone machines $U$ coincides with the
class of functions $\mu_U$ (i.e., the transformation by $U$ of $\mu$) for all ($\mu$-compatible)
universal machines $U$.

This work will be done in Section 2. First, in the current section, we cover basic
notions and notation (Subsection 1.1), discuss the characterization of the
semicomputable semimeasures as the transformations via monotone machines of a
continuous computable measure (Subsection 1.2), and the analogous characterization for
semicomputable discrete semimeasures and prefix-free machines (Subsection 1.3).

1.1. Basic notions and notation.

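Bit strings. Let $\mathbb{B} := \{0,1\}$ denote the set of bits; $\mathbb{B}^*$ the set of all finite bit strings;
$\mathbb{B}^n$ the set of bit strings $\sigma$ of length $|\sigma| = n$; $\mathbb{B}^{\le n}$ the set of bit strings $\sigma$ of
length $|\sigma| \le n$; $\mathbb{B}^\omega$ the class of all infinite bit strings. The empty string is $\epsilon$. The
concatenation of bit strings $\sigma$ and $\tau$ is written $\sigma\tau$; we write $\sigma \preccurlyeq \tau$ if $\sigma$ is an initial
segment of $\tau$ (so there is a $\rho$ such that $\sigma\rho = \tau$), and we write $\sigma \prec \tau$ if $\rho \neq \epsilon$. The
initial segment of $\sigma$ of length $n \le |\sigma|$ is denoted $\sigma{\upharpoonright}n$; the initial segment $\sigma{\upharpoonright}(|\sigma|-1)$
is denoted $\sigma^-$. Strings $\sigma$ and $\tau$ are comparable, $\sigma \sim \tau$, if $\sigma \preccurlyeq \tau$ or $\tau \preccurlyeq \sigma$; if $\sigma$ and
$\tau$ are not comparable we write $\sigma \mid \tau$.

For given finite string $\sigma$, the class $\llbracket\sigma\rrbracket := \{\sigma X : X \in \mathbb{B}^\omega\} \subseteq \mathbb{B}^\omega$ is the class of
infinite extensions of $\sigma$. Likewise, for $A \subseteq \mathbb{B}^*$, let $\llbracket A \rrbracket := \{\sigma X : \sigma \in A, X \in \mathbb{B}^\omega\}$.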

Computable measures. A probability measure over the infinite strings is generated
by a premeasure, a function $m : \mathbb{B}^* \to [0,1]$ that satisfies

(1) $m(\epsilon) = 1$;

(2) $m(\sigma 0) + m(\sigma 1) = m(\sigma)$ for all $\sigma \in \mathbb{B}^*$.

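A premeasure $m$ gives rise to an outer measure $\mu_m^* : \mathcal{P}(\mathbb{B}^\omega) \to [0,1]$ by

\[ \mu_m^*(\mathcal{A}) = \inf \Bigl\{ \sum_{\sigma \in A} m(\sigma) : \mathcal{A} \subseteq \llbracket A \rrbracket \Bigr\}. \]

By restricting to the $\mu_m^*$-measurable sets, i.e., the sets $\mathcal{A} \subseteq \mathbb{B}^\omega$ such that $\mu_m^*(\mathcal{B}) =
\mu_m^*(\mathcal{B} \cap \mathcal{A}) + \mu_m^*(\mathcal{B} \setminus \mathcal{A})$ for all $\mathcal{B} \subseteq \mathbb{B}^\omega$, we finally obtain the corresponding (probability)
measure $\mu_m$, that satisfies $\mu_m(\llbracket\sigma\rrbracket) = m(\sigma)$ for all $\sigma \in \mathbb{B}^*$.

The uniform (Lebesgue) measure $\lambda$ is given by the premeasure $m$ with $m(\sigma) =
2^{-|\sigma|}$ for all $\sigma \in \mathbb{B}^*$. A measure $\mu$ is nonatomic or continuous if there is no $X \in \mathbb{B}^\omega$
with $\mu(\{X\}) > 0$.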
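
The two premeasure conditions, and the way cone measures shrink for a continuous measure, can be checked mechanically on short strings. The following is a minimal Python sketch and not part of the original text: the names `uniform`, `bernoulli`, `is_premeasure`, and `max_cone` are hypothetical, finite bit strings are represented as Python `str` values over `0` and `1`, and exact rationals stand in for computable reals.

```python
from fractions import Fraction
from itertools import product


def uniform(sigma: str) -> Fraction:
    """Premeasure of the uniform measure: m(sigma) = 2^(-|sigma|)."""
    return Fraction(1, 2 ** len(sigma))


def bernoulli(p: Fraction):
    """Premeasure of an i.i.d. measure with P(bit = 1) = p (illustrative assumption)."""
    def m(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p ** ones * (1 - p) ** (len(sigma) - ones)
    return m


def is_premeasure(m, depth: int) -> bool:
    """Check m(empty) = 1 and m(s0) + m(s1) = m(s) for all strings s with |s| < depth."""
    if m("") != 1:
        return False
    for n in range(depth):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if m(s + "0") + m(s + "1") != m(s):
                return False
    return True


def max_cone(m, n: int) -> Fraction:
    """Largest measure assigned to any cone generated by a string of length n."""
    return max(m("".join(bits)) for bits in product("01", repeat=n))
```

For the Bernoulli premeasure with bias 1/3, the largest cone of length n has measure (2/3)^n, which tends to 0 as n grows; a bound of this kind on all cones of a given length is what rules out atoms.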

We call a total real-valued function $f : \mathbb{B}^* \to \mathbb{R}$ computable if its values are
uniformly computable reals: there is a computable $g : \mathbb{B}^* \times \mathbb{N} \to \mathbb{Q}$ such that
$|g(\sigma, k) - f(\sigma)| < 2^{-k}$ for all $\sigma, k$. This allows us to talk about computable
premeasures. A measure $\mu$ we then call computable if $\mu = \mu_m$ for a computable premeasure
$m$.

Semicomputable semimeasures. We call a total real-valued function $f : \mathbb{B}^* \to \mathbb{R}$
(lower) semicomputable if there are uniformly computable functions $f_t : \mathbb{B}^* \to \mathbb{Q}$
such that for all $\sigma \in \mathbb{B}^*$, we have $f_{t+1}(\sigma) \ge f_t(\sigma)$ for all $t \in \mathbb{N}$ and
$\lim_{t\to\infty} f_t(\sigma) = f(\sigma)$.

Levin [23, Definition 3.6] introduced the notion of a semicomputable measure
over the collection $\mathbb{B}^* \cup \mathbb{B}^\omega$ of finite and infinite strings. This is equivalent to a
semimeasure over the infinite strings that is generated from a premeasure $m$ that
only needs to satisfy

(1) $m(\epsilon) \le 1$;

(2) $m(\sigma 0) + m(\sigma 1) \le m(\sigma)$ for all $\sigma \in \mathbb{B}^*$.

Following [5], we will simply treat a semimeasure as a function over the cones
$\{\llbracket\sigma\rrbracket : \sigma \in \mathbb{B}^*\}$:

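Definition 1.1. A semicomputable semimeasure is a function $\nu : \{\llbracket\sigma\rrbracket : \sigma \in \mathbb{B}^*\} \to
[0,1]$ such that $\nu(\llbracket\cdot\rrbracket) : \mathbb{B}^* \to [0,1]$ is semicomputable, and

(1) $\nu(\llbracket\epsilon\rrbracket) \le 1$;

(2) $\nu(\llbracket\sigma 0\rrbracket) + \nu(\llbracket\sigma 1\rrbracket) \le \nu(\llbracket\sigma\rrbracket)$ for all $\sigma \in \mathbb{B}^*$.

Moreover, we follow the custom of writing $\nu(\sigma)$ for $\nu(\llbracket\sigma\rrbracket)$. Let $\mathcal{M}$ denote the
class of all semicomputable semimeasures.¹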

1.2. Monotone machines and semicomputable semimeasures. 

Machines. The following definition is due to Levin [10]. (Similar machine models 
were already described in [23], and by Solomonoff [20] and Schnorr [19]; see [3].) 

Definition 1.2. A monotone machine is a c.e. set $M \subseteq \mathbb{B}^* \times \mathbb{B}^*$ of pairs of strings
such that if $(\rho_1, \sigma_1), (\rho_2, \sigma_2) \in M$ and $\rho_1 \preccurlyeq \rho_2$ then $\sigma_1 \sim \sigma_2$.

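We will not go into the concrete machine model that corresponds to the above
abstract definition (see, for instance, [5, p. 145]); we only note that a machine $M$ as
defined above induces a function $N_M : \mathbb{B}^* \cup \mathbb{B}^\omega \to \mathbb{B}^* \cup \mathbb{B}^\omega$ by
$N_M(X) = \sup_{\preccurlyeq} \{\sigma \in \mathbb{B}^* : \exists \rho \preccurlyeq X ((\rho, \sigma) \in M)\}$ (cf. [7]).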
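
For a finite set of pairs, the monotonicity condition and the induced input-output function can be computed directly. The sketch below is an illustration only: `comparable`, `is_monotone`, `machine_output`, and the example machine `M` are hypothetical names and data, the machine is taken to be a finite Python set of `(description, output)` pairs rather than a c.e. set, and only finite inputs are handled.

```python
def comparable(a: str, b: str) -> bool:
    """Two strings are comparable if one is an initial segment of the other."""
    return a.startswith(b) or b.startswith(a)


def is_monotone(M) -> bool:
    """Check: whenever rho1 is a prefix of rho2, the outputs sigma1, sigma2 are comparable."""
    return all(comparable(s1, s2)
               for (r1, s1) in M for (r2, s2) in M
               if r2.startswith(r1))


def machine_output(M, x: str) -> str:
    """Output on finite input x: the longest sigma with a description that is a prefix of x.

    For a monotone machine all such sigma are pairwise comparable, so the
    longest one is their supremum in the prefix order.
    """
    outputs = [sigma for (rho, sigma) in M if x.startswith(rho)]
    return max(outputs, key=len, default="")


# A small example machine (hypothetical data).
M = {("0", "0"), ("00", "01"), ("1", "1"), ("10", "11"), ("11", "10")}
```

On input `001` the descriptions `0` and `00` both apply, with outputs `0` and `01`, so the machine outputs `01`.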

Transformations. Imagine that we feed a monotone machine $M$ a stream of input
that is generated from a computable measure $\mu$. As a result, machine $M$ produces
a (finite or infinite) stream of output. The probabilities for the possible initial
segments of the output stream are themselves given by a semicomputable semimeasure
(as can easily be verified). We will call this semimeasure the transformation of $\mu$
by $M$.

Definition 1.3. The transformation $\mu_M$ of computable measure $\mu$ by monotone
machine $M$ is defined by

\[ \mu_M(\sigma) := \mu(\llbracket \{\rho : \exists \sigma' \succcurlyeq \sigma ((\rho, \sigma') \in M)\} \rrbracket). \]
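
The definition can be evaluated exactly when the machine is a finite set of pairs. The following Python sketch makes that concrete under stated assumptions: `minimal`, `transform`, `uniform`, `bernoulli`, and the machine `M` are hypothetical names and data, and the measure of a union of cones is computed by summing the premeasure over the prefix-minimal descriptions, which generate the same union with disjoint cones.

```python
from fractions import Fraction


def uniform(rho: str) -> Fraction:
    """Premeasure of the uniform measure."""
    return Fraction(1, 2 ** len(rho))


def bernoulli(p: Fraction):
    """Premeasure of an i.i.d. measure with P(bit = 1) = p (illustrative assumption)."""
    def m(rho: str) -> Fraction:
        ones = rho.count("1")
        return p ** ones * (1 - p) ** (len(rho) - ones)
    return m


def minimal(strings):
    """Keep only the strings that have no proper prefix in the set."""
    S = set(strings)
    return {r for r in S if not any(q != r and r.startswith(q) for q in S)}


def transform(m, M, sigma: str) -> Fraction:
    """Measure of the inputs on which M outputs some extension of sigma."""
    descriptions = {rho for (rho, out) in M if out.startswith(sigma)}
    return sum((m(rho) for rho in minimal(descriptions)), Fraction(0))


# A small monotone machine (hypothetical data).
M = {("0", "0"), ("00", "01"), ("1", "1"), ("10", "11"), ("11", "10")}
```

With the uniform measure, `transform(uniform, M, "01")` is the measure of the cone of `00`, namely 1/4; replacing `uniform` by `bernoulli(Fraction(1, 3))` changes the values but the result is still a semimeasure on the output strings.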


¹ Semimeasures as defined here are often referred to as continuous semimeasures, in
contradistinction to the discrete semimeasures defined in Subsection 1.3 below (cf. [13, 5]). Due to the
possibility of confusion with the earlier meaning of “continuous” as synonymous to “nonatomic,”
we will avoid this usage here.

Characterizations of $\mathcal{M}$. For every given semicomputable semimeasure $\nu$, one can
obtain a machine $M$ that transforms the uniform measure $\lambda$ to $\nu$. Together with
the straightforward converse that every function $\lambda_M$ defines a semicomputable
semimeasure, this gives a characterization of the class $\mathcal{M}$ of semicomputable
semimeasures as

(1) $\mathcal{M} = \{\lambda_M\}_M$,

where $\{\lambda_M\}_M$ is the class of functions $\lambda_M$ for all monotone machines $M$.

A proof of this fact by a construction of an $M$ that transforms $\lambda$ to given $\nu$ was
first outlined by Levin in [23, Theorem 3.2]. (Also see [13, Theorem 4.5.2].)
Moreover, it can be deduced from [23, Theorem 3.1(b), 3.2] that $\mathcal{M}$ can be characterized
as the class of transformations of computable measures other than $\lambda$. Namely, we
have that $\mathcal{M}$ coincides with $\{\mu_M\}_M$ for any computable $\mu$ that is continuous.

A detailed construction to prove the characterization (1) was published by Day
[4, Theorem 4(ii)]. (Also see [5, Theorem 3.16.2(ii)].) The following proof of the
case for any continuous computable measure is an adaptation of this construction.

Theorem 1.4 (Levin). For every continuous computable measure $\mu$, there is for
every semicomputable semimeasure $\nu$ a monotone machine $M$ such that $\nu = \mu_M$.

Proof. Let $\nu$ be any semicomputable semimeasure, with uniformly computable
approximation functions $f_t$. We construct in stages $s = \langle\sigma, t\rangle$ a monotone machine
$M$ that transforms $\mu$ into $\nu$. Let $D_s(\sigma) := \{\rho \in \mathbb{B}^* : (\rho, \sigma) \in M_s\}$.

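Construction. Let $M_0 := \emptyset$.

At stage $s = \langle\sigma, t\rangle$, if $\mu(\llbracket D_{s-1}(\sigma)\rrbracket) = f_t(\sigma)$ then let $M_s := M_{s-1}$.

Otherwise, first consider the case $\sigma \neq \epsilon$. By Lemma 1 in [4] there is a set $R \subseteq \mathbb{B}^s$
of available strings of length $s$ such that $\llbracket R\rrbracket = \llbracket D_{s-1}(\sigma^-)\rrbracket \setminus (\llbracket D_{s-1}(\sigma^- 0)\rrbracket \cup
\llbracket D_{s-1}(\sigma^- 1)\rrbracket)$. Denote $x := \mu(\llbracket R\rrbracket)$, the amount of measure available for
descriptions for $\sigma$, which equals $\mu(\llbracket D_{s-1}(\sigma^-)\rrbracket) - \mu(\llbracket D_{s-1}(\sigma^- 0)\rrbracket) - \mu(\llbracket D_{s-1}(\sigma^- 1)\rrbracket)$
because we ensure by construction that $\llbracket D_{s-1}(\sigma^-)\rrbracket \supseteq \llbracket D_{s-1}(\sigma^- 0)\rrbracket \cup \llbracket D_{s-1}(\sigma^- 1)\rrbracket$
and $\llbracket D_{s-1}(\sigma^- 0)\rrbracket \cap \llbracket D_{s-1}(\sigma^- 1)\rrbracket = \emptyset$. Denote $y := f_t(\sigma) - \mu(\llbracket D_{s-1}(\sigma)\rrbracket)$, the
amount of measure the current descriptions fall short of the latest approximation
of $\nu(\sigma)$. We collect in the auxiliary set $A_s$ a number of available strings from $R$
such that $\mu(\llbracket A_s\rrbracket)$ is maximal while still bounded by $\min\{x, y\}$.

If $\sigma = \epsilon$, then denote $y := f_t(\epsilon) - \mu(\llbracket D_{s-1}(\epsilon)\rrbracket)$. Collect in $A_s$ a number of
available strings from $R \subseteq \mathbb{B}^s$ with $\llbracket R\rrbracket = \mathbb{B}^\omega \setminus \llbracket D_{s-1}(\epsilon)\rrbracket$ such that $\mu(\llbracket A_s\rrbracket)$ is
maximal but bounded by $y$.

Put $M_s := M_{s-1} \cup \{(\rho, \sigma) : \rho \in A_s\}$.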

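Verification. The verification of the fact that $M$ is a monotone machine is identical
to that in [4].

It remains to prove that $\mu_M(\sigma) = \nu(\sigma)$ for all $\sigma \in \mathbb{B}^*$. Since by construction
$\llbracket D_s(\sigma')\rrbracket \subseteq \llbracket D_s(\sigma)\rrbracket$ for any $\sigma' \succcurlyeq \sigma$, we have that $\mu_{M_s}(\sigma) = \mu(\llbracket \bigcup_{\sigma' \succcurlyeq \sigma} D_s(\sigma')\rrbracket) =
\mu(\llbracket D_s(\sigma)\rrbracket)$. Hence $\mu_M(\sigma) = \lim_{s\to\infty} \mu(\llbracket D_s(\sigma)\rrbracket)$, and our objective is to show that
$\lim_{s\to\infty} \mu(\llbracket D_s(\sigma)\rrbracket) = \nu(\sigma)$. To that end it suffices to demonstrate that for every
$\delta > 0$ there is some stage $s_0$ where $\mu(\llbracket D_{s_0}(\sigma)\rrbracket) > \nu(\sigma) - \delta$. We prove this by
induction.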

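For the base step, let $\sigma = \epsilon$. Choose positive $\delta' < \delta$. There will be a stage
$s_0 = \langle\epsilon, t_0\rangle$ where $f_{t_0}(\epsilon) > \nu(\epsilon) - \delta'$, and (since $\mu$ is continuous) $\mu(\llbracket\rho\rrbracket) < \delta - \delta'$ for
all $\rho \in \mathbb{B}^{s_0}$. Then, if not already $\mu(\llbracket D_{s_0-1}(\epsilon)\rrbracket) > \nu(\epsilon) - \delta$, the latter guarantees that
the construction will select a number of available strings in $A_{s_0}$ such that $\nu(\epsilon) - \delta <
\mu(\llbracket D_{s_0-1}(\epsilon)\rrbracket) + \mu(\llbracket A_{s_0}\rrbracket) \le f_{t_0}(\epsilon)$. It follows that $\mu(\llbracket D_{s_0}(\epsilon)\rrbracket) = \mu(\llbracket D_{s_0-1}(\epsilon)\rrbracket) +
\mu(\llbracket A_{s_0}\rrbracket) > \nu(\epsilon) - \delta$ as required.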

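For the inductive step, let $\sigma \neq \epsilon$, and denote by $\sigma'$ the one-bit extension of $\sigma^-$
with $\sigma' \mid \sigma$. Choose positive $\delta' < \delta$. By induction hypothesis, there exists a stage
$s_0'$ such that $\mu(\llbracket D_{s_0'}(\sigma^-)\rrbracket) > \nu(\sigma^-) - \delta'$. At this stage $s_0'$, we have

\begin{align*}
\mu(\llbracket D_{s_0'}(\sigma^-)\rrbracket) - \mu(\llbracket D_{s_0'}(\sigma')\rrbracket) &\ge \mu(\llbracket D_{s_0'}(\sigma^-)\rrbracket) - \nu(\sigma') \\
&> \nu(\sigma^-) - \delta' - \nu(\sigma') \\
&\ge \nu(\sigma) - \delta',
\end{align*}

where the last inequality follows from the semimeasure property $\nu(\sigma^-) \ge \nu(\sigma) +
\nu(\sigma')$. There will be a stage $s_0 = \langle\sigma, t_0\rangle \ge s_0'$ with $f_{t_0}(\sigma) > \nu(\sigma) - \delta'$ and $\mu(\llbracket\rho\rrbracket) <
\delta - \delta'$ for all $\rho \in \mathbb{B}^{s_0}$. Clearly, $\min\{\mu(\llbracket D_{s_0}(\sigma^-)\rrbracket) - \mu(\llbracket D_{s_0}(\sigma')\rrbracket), f_{t_0}(\sigma)\} > \nu(\sigma) - \delta'$.
Then, as in the base case, if not already $\mu(\llbracket D_{s_0-1}(\sigma)\rrbracket) > \nu(\sigma) - \delta$, the construction
selects a number of available descriptions such that $\mu(\llbracket D_{s_0}(\sigma)\rrbracket) > \nu(\sigma) - \delta$ as
required. □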

Corollary 1.5. For every continuous computable measure $\mu$,

\[ \{\mu_M\}_M = \mathcal{M}. \]

1.3. Prefix-free machines and discrete semimeasures. The notions of a
semicomputable discrete semimeasure on the finite strings and a prefix-free machine can
be traced back to Levin [11] and Gács [6], and independently Chaitin [1].

Definition 1.6. A semicomputable discrete semimeasure is a semicomputable
function $P : \mathbb{B}^* \to \mathbb{R}^{\ge 0}$ such that $\sum_{\sigma \in \mathbb{B}^*} P(\sigma) \le 1$.

Definition 1.7. A prefix-free machine is a partial computable function $T : \mathbb{B}^* \to
\mathbb{B}^*$ with prefix-free domain.

Definition 1.8. The transformation of computable measure $\mu$ by prefix-free
machine $T$ is the semicomputable discrete semimeasure $Q^\mu_T : \mathbb{B}^* \to [0,1]$ defined by

\[ Q^\mu_T(\sigma) := \mu(\llbracket \{\rho : (\rho, \sigma) \in T\} \rrbracket). \]

Let $\mathcal{P}$ denote the class of all semicomputable discrete semimeasures. Analogous
to class $\mathcal{M}$ and the monotone machines, class $\mathcal{P}$ is characterized as all prefix-free
machine transformations of $\mu$, for any continuous computable $\mu$. The fact that every
$P$ can be obtained as a transformation of $\lambda$ is usually inferred from the effective
version of Kraft’s inequality (e.g., [5, p. 130], [14, Exercise 2.2.23]). However, we
can easily prove the general case in a direct manner by a much simplified version
of the construction for Theorem 1.4.

Proposition 1.9. For every continuous computable measure $\mu$, there is for every
semicomputable discrete semimeasure $P$ a prefix-free machine $T$ such that $P = Q^\mu_T$.

Proof. Let $P$ be any semicomputable discrete semimeasure, with uniformly
computable approximation functions $f_t$. We construct a prefix-free machine $T$ in stages
$s = \langle\sigma, t\rangle$. Let $D_s(\sigma) = \{\rho \in \mathbb{B}^* : (\rho, \sigma) \in T_s\}$.

Construction. Let $T_0 := \emptyset$.

At stage $s = \langle\sigma, t\rangle$, if $\mu(\llbracket D_{s-1}(\sigma)\rrbracket) = f_t(\sigma)$ then let $T_s := T_{s-1}$.

Otherwise, let the set $R \subseteq \mathbb{B}^s$ of available strings be such that $\llbracket R\rrbracket = \mathbb{B}^\omega \setminus
\llbracket \bigcup_{\tau \in \mathbb{B}^*} D_{s-1}(\tau)\rrbracket$. Collect in the auxiliary set $A_s$ a number of available strings $\rho$
from $R$ with $\sum_{\rho \in A_s} \mu(\llbracket\rho\rrbracket)$ maximal but bounded by $f_t(\sigma) - \mu(\llbracket D_{s-1}(\sigma)\rrbracket)$, the
amount of measure the current descriptions fall short of the latest approximation
of $P(\sigma)$. Put $T_s := T_{s-1} \cup \{(\rho, \sigma) : \rho \in A_s\}$.

Verification. It is immediate from the construction that $\bigcup_{\sigma \in \mathbb{B}^*} D_s(\sigma)$ is prefix-free
at all stages $s$, so $T = \lim_{s\to\infty} T_s$ is a prefix-free machine. To show that $Q^\mu_T(\sigma) =
\lim_{s\to\infty} \mu(\llbracket D_s(\sigma)\rrbracket)$ equals $P(\sigma)$ for all $\sigma \in \mathbb{B}^*$, it suffices to demonstrate that for
every $\delta > 0$ there is some stage $s_0$ where $\mu(\llbracket D_{s_0}(\sigma)\rrbracket) > P(\sigma) - \delta$.

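Choose positive $\delta' < \delta$. Wait for a stage $s_0 = \langle\sigma, t_0\rangle$ with $\mu(\llbracket\rho\rrbracket) < \delta - \delta'$ for all
$\rho \in \mathbb{B}^{s_0}$ and $f_{t_0}(\sigma) > P(\sigma) - \delta'$. Clearly, the available $\mu$-measure

\begin{align*}
\mu(\llbracket R\rrbracket) &= 1 - \sum_{\tau \in \mathbb{B}^*} \mu(\llbracket D_{s_0-1}(\tau)\rrbracket) \\
&\ge 1 - \mu(\llbracket D_{s_0-1}(\sigma)\rrbracket) - \sum_{\tau \in \mathbb{B}^* \setminus \{\sigma\}} P(\tau) \\
&\ge P(\sigma) - \mu(\llbracket D_{s_0-1}(\sigma)\rrbracket) \\
&\ge f_{t_0}(\sigma) - \mu(\llbracket D_{s_0-1}(\sigma)\rrbracket).
\end{align*}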

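Consequently, if not already $\mu(\llbracket D_{s_0-1}(\sigma)\rrbracket) > P(\sigma) - \delta$, then the construction
collects in $A_{s_0}$ a number of descriptions of length $s_0$ from $R$ such that $\mu(\llbracket D_{s_0}(\sigma)\rrbracket) =
\mu(\llbracket D_{s_0-1}(\sigma)\rrbracket) + \sum_{\rho \in A_{s_0}} \mu(\llbracket\rho\rrbracket) > P(\sigma) - \delta$ as required. □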
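
The staged construction in this proof can be simulated for a finitely supported target. The Python sketch below is a simplified illustration under several assumptions that are not in the original: the target `P` has finite support, the approximations are taken to be constant (equal to `P` from the start), stages cycle through the support in a fixed order, and strings are collected greedily in lexicographic order, which need not give the exact maximum the proof asks for when cone measures differ. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def uniform(rho: str) -> Fraction:
    """Premeasure of the uniform measure."""
    return Fraction(1, 2 ** len(rho))


def measure_of(m, T, sigma: str) -> Fraction:
    """Total measure of the descriptions of sigma (their cones are disjoint)."""
    return sum((m(r) for r, out in T.items() if out == sigma), Fraction(0))


def build_machine(m, P, stages: int):
    """Greedy staged construction of a finite prefix-free machine T (dict: description -> output)."""
    targets = list(P)
    T = {}
    for s in range(1, stages + 1):
        sigma = targets[(s - 1) % len(targets)]
        deficit = P[sigma] - measure_of(m, T, sigma)
        if deficit <= 0:
            continue
        for bits in product("01", repeat=s):
            rho = "".join(bits)
            if any(rho.startswith(r) for r in T):
                continue  # rho extends an existing description, so it is not available
            if 0 < m(rho) <= deficit:
                T[rho] = sigma
                deficit -= m(rho)
    return T


def is_prefix_free(strings) -> bool:
    S = list(strings)
    return not any(a != b and b.startswith(a) for a in S for b in S)


# Hypothetical target with total weight 23/24.
P = {"0": Fraction(1, 2), "1": Fraction(1, 3), "00": Fraction(1, 8)}
T = build_machine(uniform, P, 9)
```

After nine stages the descriptions of `0` and `00` carry exactly the target weight, while the weight for `1` (which is not a dyadic rational) is approached from below, with a remaining gap smaller than the measure of a single cone of the last length used for it.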

Corollary 1.10. For every continuous computable measure $\mu$,

\[ \{Q^\mu_T\}_T = \mathcal{P}. \]


2. The a priori semimeasures 

In this section we show that the class of a priori semimeasures can be
characterized as the class of universal transformations of any continuous computable
measure. Subsection 2.1 introduces the class of a priori semimeasures. Subsection
2.2 is an interlude devoted to the representation of the a priori semimeasures as
universal mixtures. Subsection 2.3 presents the generalized characterization, and
concludes with a brief discussion of how this reflects on the association with
foundational principles.

2.1. A priori semimeasures. 

Universal machines. Let $\{\rho_e\}_{e \in \mathbb{N}} \subseteq \mathbb{B}^*$ be any computable prefix-free and
non-repeating enumeration of finite strings, that will serve as an encoding of some
computable enumeration $\{M_e\}_{e \in \mathbb{N}}$ of all monotone machines. We say that a
monotone machine $U$ is universal (by adjunction) if for some such encoding $\{\rho_e\}_{e \in \mathbb{N}}$, we
have for all $\rho, \sigma \in \mathbb{B}^*$ that

\[ (\rho_e \rho, \sigma) \in U \iff (\rho, \sigma) \in M_e. \]

By a universal machine we will mean a machine that is universal by adjunction.
Contrast this to weak universality, which is the more general property that for all
$M$ there is a $c_M \in \mathbb{N}$ such that

\[ (\rho, \sigma) \in M \implies \exists \rho' \, (|\rho'| < |\rho| + c_M \ \&\ (\rho', \sigma) \in U). \]

A priori semimeasures. We call a transformation by a universal machine a universal 
transformation. The a priori semimeasures are the universal transformations of the 
uniform measure. 

Definition 2.1. An a priori semimeasure is defined by

\[ \lambda_U(\sigma) := \lambda(\llbracket \{\rho : \exists \sigma' \succcurlyeq \sigma ((\rho, \sigma') \in U)\} \rrbracket) \]
for universal monotone machine $U$.

Let $\mathcal{A}$ denote the class $\{\lambda_U\}_U$ of a priori semimeasures. The next result implies
that every element of $\mathcal{A}$ can also be obtained as the transformation of $\lambda$ by a
machine that is not universal.

Proposition 2.2. For every continuous computable measure $\mu$, there is for every
semicomputable semimeasure $\nu$ a non-universal monotone machine $M$ such that
$\nu = \mu_M$.

Proof. Let $U$ be an arbitrary universal machine. We will adapt the construction of
Theorem 1.4 of a machine $M$ with $\mu_M = \nu$ in such a way that for every constant
$c \in \mathbb{N}$ there is a $\sigma$ such that for some $\rho'$ with $(\rho', \sigma) \in U$, we have that $|\rho| > |\rho'| + c$
for all $\rho$ with $(\rho, \sigma) \in M$. This ensures that $M$ is not even weakly universal.

Construction. The only change to the earlier construction is that at stage $s$ we try
to collect available strings of length $l_s$, where $l_s$ is defined as follows. Let $l_0 = 0$.
For $s = \langle\sigma, t\rangle$ with $t > 0$, let $l_s = l_{s-1} + 1$. In case $s = \langle\sigma, 0\rangle$, enumerate pairs in $U$
until a pair $(\rho', \sigma)$ for some $\rho'$ is found. Let $l_s := \max\{l_{s-1} + 1, |\rho'| + s\}$.

Verification. The verification that $\mu_M = \nu$ proceeds as before. In addition, the
construction guarantees that for every $c \in \mathbb{N}$, we have for $\sigma$ with $c = \langle\sigma, 0\rangle$ that
$|\rho| > |\rho'| + c$ for the first enumerated $\rho'$ with $(\rho', \sigma) \in U$ and all $\rho$ with $(\rho, \sigma) \in
M$. □

2.2. Universal mixtures. Every element of $\mathcal{A}$ is equal to a universal mixture

(2) $\xi_W(\cdot) := \sum_{i \in \mathbb{N}} W(i)\,\nu_i(\cdot)$

for some effective enumeration $\{\nu_i\}_{i \in \mathbb{N}} = \mathcal{M}$ of all semicomputable
semimeasures, and some semicomputable weight function $W : \mathbb{N} \to [0,1]$ that satisfies
$\sum_{i \in \mathbb{N}} W(i) \le 1$ and $W(i) > 0$ for all $i$. Conversely, one can show that every
universal mixture equals $\lambda_U$ for some universal machine $U$ [22].

Let $\mathcal{U}$ denote the elements $\kappa$ of $\mathcal{M}$ that are universal in the sense that they
dominate every other semicomputable semimeasure. That is, for such $\kappa \in \mathcal{U}$ there
is for every $\nu \in \mathcal{M}$ a constant $c_\nu \in \mathbb{N}$, depending only on $\kappa$ and $\nu$, such that
$\kappa(\sigma) \ge c_\nu^{-1} \nu(\sigma)$ for all $\sigma \in \mathbb{B}^*$. It is clear from the mixture form of the a priori
semimeasures that $\mathcal{A} \subseteq \mathcal{U}$. This inclusion is strict: not all universal elements are
of the form $\lambda_U$ (equivalently, mixtures). For instance, $\xi_W(\epsilon) < 1$ for all $W$ because
$\nu(\epsilon) < 1$ for some $\nu \in \mathcal{M}$, but we can obviously define a universal $\kappa \in \mathcal{M}$ with
$\kappa(\epsilon) = 1$.

We can strengthen the above statement of the equivalence of the a priori
semimeasures and the universal mixtures by requiring a computable weight function $W$ over
a fixed enumeration as follows.

First, let us call an enumeration $\{\nu_i\}_{i \in \mathbb{N}}$ of all semicomputable semimeasures
acceptable if it is generated from an enumeration $\{M_i\}_i$ of all monotone Turing
machines by the procedure of Theorem 1.4, i.e., $\nu_i = \lambda_{M_i}$. This terminology matches
that of the definition of acceptable numberings of the partial computable functions
[18, p. 41]. Every effective listing of all Turing machines yields an acceptable
numbering. Importantly, any two acceptable numberings differ only by a computable
permutation [17]; in our case, for any two acceptable enumerations $\{\nu_i\}_i$ and $\{\bar\nu_i\}_i$
there is a computable permutation $f : \mathbb{N} \to \mathbb{N}$ of indices such that $\bar\nu_i = \nu_{f(i)}$.

Furthermore, let us call a semicomputable weight function $W$ proper if $\sum_i W(i) =
1$; this implies that $W$ is computable.

Then we can show that for any acceptable enumeration of all semicomputable
semimeasures, all elements in $\mathcal{A}$ are expressible as some mixture with a proper
weight function over this enumeration.

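Proposition 2.3. For every acceptable enumeration $\{\nu_i\}_i$ of $\mathcal{M}$, every element in
$\mathcal{A}$ is equal to $\xi_W(\cdot) = \sum_i W(i)\nu_i(\cdot)$ for some proper $W$.

Proof. Given $\lambda_U \in \mathcal{A}$, with enumeration $\{M_i\}_i$ of all monotone machines
corresponding to $U$. We know that $\lambda_U$ is equal to $\xi_W(\cdot) = \sum_i W(i)\bar\nu_i(\cdot)$ for
acceptable enumeration $\{\bar\nu_i\}_i = \{\lambda_{M_i}\}_i$ of $\mathcal{M}$ and semicomputable weight function $W$.
First we show that $\xi_W$ is equal to $\xi_{W'}(\cdot) = \sum_i W'(i)\nu_i(\cdot)$ for given acceptable
enumeration $\{\nu_i\}_i$ and semicomputable $W'$; then we show that it is also equal to
$\xi_{W''}(\cdot) = \sum_i W''(i)\nu_i(\cdot)$ for proper $W''$.

Since enumerations $\{\nu_i\}_i$ and $\{\bar\nu_i\}_i$ are both acceptable, there is a 1-1
computable $f$ such that $\bar\nu_i = \nu_{f(i)}$. Then

\begin{align*}
\sum_i W(i)\bar\nu_i(\cdot) &= \sum_i W(i)\nu_{f(i)}(\cdot) \\
&= \sum_i W(f^{-1}(i))\nu_i(\cdot) \\
&= \sum_i W'(i)\nu_i(\cdot),
\end{align*}

with $W' : i \mapsto W(f^{-1}(i))$.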

We proceed with the description of a proper $W''$. The idea is to have $W''$ assign
to each $i$ a positive computable weight that does not exceed $W'(i)$, additional
computable weight to the index of a single suitably defined semimeasure in order
to regain the original mixture, and all of the remaining weight to an “empty”
semimeasure.

Let $q \in \mathbb{Q}$ be such that $\xi_{W'}(\epsilon) < q < 1$, and let $c$ be such that $\sum_i 2^{-i-c} < 1 - q$.
Let $W'_0(i)$ denote the first approximation of semicomputable $W'(i)$ that is positive.
We now define computable $g : \mathbb{N} \to \mathbb{Q}$ by

\[ g(i) = \min\{2^{-i-c}, W'_0(i)\}. \]

Clearly, $\sum_i g(i) < 1 - q$. Moreover, $\sum_i g(i)$ is computable because for any $\delta > 0$
we have a $j \in \mathbb{N}$ with $\sum_{i > j} 2^{-i-c} < \delta$, hence $\sum_{i \le j} g(i) \le \sum_i g(i) < \sum_{i \le j} g(i) + \delta$.
Next, define $\pi(\cdot) = q^{-1} \sum_i (W'(i) - g(i))\,\nu_i(\cdot)$. This is a semimeasure because
$\pi(\epsilon) \le q^{-1}\xi_{W'}(\epsilon) < q^{-1}q = 1$. Let $k$ be such that $\nu_k = \pi$, and let $l$ be such that $\nu_l$
is the “empty” semimeasure with $\nu_l(\sigma) = 0$ for all $\sigma \in \mathbb{B}^*$ (both indices exist even
if we cannot effectively find them).

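Finally, we define $W''$ by

\[ W''(i) = \begin{cases} g(i) & \text{if } i \neq k, l \\ g(i) + q & \text{if } i = k \\ 1 - q - \sum_{j \neq l} g(j) & \text{if } i = l. \end{cases} \]

Weight function $W''$ is computable and indeed proper, and

\begin{align*}
\sum_i W''(i)\nu_i(\cdot) &= \sum_i g(i)\nu_i(\cdot) + q\,\nu_k(\cdot) + 0 \\
&= \sum_i g(i)\nu_i(\cdot) + \sum_i (W'(i) - g(i))\,\nu_i(\cdot) \\
&= \sum_i W'(i)\nu_i(\cdot).
\end{align*}

□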
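
As a check that the weight function constructed in this proof is indeed proper, one can sum its three cases. The short derivation below is a sketch under two assumptions made here for illustration: that the weight placed on the index $l$ of the empty semimeasure is the remaining mass $1 - q - \sum_{j \neq l} g(j)$, in line with the stated idea of giving "all of the remaining weight" to that index, and that $k \neq l$.

```latex
\begin{align*}
\sum_i W''(i)
  &= \sum_{i \neq k,l} g(i) + \bigl(g(k) + q\bigr)
     + \Bigl(1 - q - \sum_{j \neq l} g(j)\Bigr) \\
  &= \sum_{i \neq l} g(i) + q + 1 - q - \sum_{j \neq l} g(j) \\
  &= 1.
\end{align*}
```

The weight on $l$ is positive because $\sum_i g(i) < 1 - q$, so every index receives positive weight.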

As a kind of converse, we can derive that any universal mixture is also equal to a
universal mixture with a universal weight function, i.e., a weight function $W$ such
that for all other $W'$ there is a $c_{W'}$ with $W(i) \ge c_{W'}^{-1} W'(i)$ for all $i$.

Proposition 2.4. For every acceptable enumeration $\{\nu_i\}_i$ of $\mathcal{M}$, every element in
$\mathcal{A}$ is equal to $\xi_W(\cdot) = \sum_i W(i)\nu_i(\cdot)$ for some universal $W$.

Proof. By the above proposition we know that any given element in $\mathcal{A}$ equals
$\xi_W = \sum_i W(i)\nu_i$ for some (computable) $W$ over given $\{\nu_i\}_i$. Let $k$ be such that
$\nu_k = \sum_i 2^{-K(i)}\nu_i$, with $K(i)$ the prefix-free Kolmogorov complexity (via some
universal prefix-free machine $U$) of the $i$-th lexicographically ordered string; $2^{-K(\cdot)}$ is
a universal weight function. Define

\[ W'(i) = \begin{cases} W(i) + W(k) \cdot 2^{-K(i)} & \text{if } i \neq k \\ W(k) \cdot 2^{-K(k)} & \text{if } i = k, \end{cases} \]

which is a weight function because $\sum_i W'(i) \le \sum_{i \neq k} W(i) + W(k) = \sum_i W(i)$.
Moreover, $W'$ is universal because $2^{-K(\cdot)}$ is, and

\begin{align*}
\sum_i W'(i)\nu_i(\cdot) &= \sum_{i \neq k} W(i)\nu_i(\cdot) + W(k) \sum_i 2^{-K(i)}\nu_i(\cdot) \\
&= \sum_i W(i)\nu_i(\cdot).
\end{align*}

□

Hutter [8, p. 102-03] argues that a universal mixture with weight function $2^{-K(\cdot)}$
is optimal among all universal mixtures, essentially because this weight function
is universal. The above result shows that this optimality is meaningless: every
universal mixture can be represented so as to have a universal weight function.

2.3. The generalized characterization. We are now ready to show that the
universal transformations of any continuous computable measure $\mu$ yield the same
class $\mathcal{A}$ of a priori semimeasures. A minor caveat is that we will need to restrict
the universal machines $U$ to those machines with associated encodings $\{\rho_e\}_e$ that
do not receive measure 0 from $\mu$: so $\mu(\llbracket\rho_e\rrbracket) > 0$ for all $e \in \mathbb{N}$. Call (the associated
encodings of) those machines compatible with measure $\mu$. This is clearly no
restriction for measures that give positive probability to every finite string (such as the
uniform measure): all machines are compatible with such measures.
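
The compatibility condition is easy to illustrate with a continuous measure that assigns measure 0 to some cones. The following Python sketch is an assumption-laden example, not part of the original: `zero_first` is a hypothetical premeasure that puts all mass on sequences starting with `0`, `is_compatible` is a hypothetical helper, and only a finite initial part of an encoding can be checked this way.

```python
from fractions import Fraction


def uniform(s: str) -> Fraction:
    """Premeasure of the uniform measure: every cone has positive measure."""
    return Fraction(1, 2 ** len(s))


def zero_first(s: str) -> Fraction:
    """Continuous premeasure concentrated on sequences that start with 0 (hypothetical)."""
    if not s:
        return Fraction(1)
    if s[0] == "1":
        return Fraction(0)
    return Fraction(1, 2 ** (len(s) - 1))


def is_compatible(m, codes) -> bool:
    """Check that every listed code word generates a cone of positive measure."""
    return all(m(code) > 0 for code in codes)


unary_codes = ["0", "10", "110"]        # code words 1^e 0
shifted_codes = ["01", "001", "0001"]   # code words 0^(e+1) 1
```

Here the encoding by `unary_codes` is compatible with the uniform measure but not with `zero_first`, whereas `shifted_codes` is compatible with both.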

We will prove: 

Theorem 2.5. Let $\mu$, $\bar\mu$ be continuous computable measures. For universal machine
$U$ that is compatible with $\mu$, there is universal machine $\bar U$ such that $\mu_U = \bar\mu_{\bar U}$.

It follows that $\{\mu_U\}_U = \{\bar\mu_{\bar U}\}_{\bar U}$ for any two continuous computable $\mu$ and $\bar\mu$,
with $U$ ranging over those universal machines compatible with $\mu$ and $\bar U$ over those
universal machines compatible with $\bar\mu$. In particular, since $\lambda$ is itself a continuous
computable measure, we have that $\{\mu_U\}_U = \mathcal{A}$.

Our proof strategy is to expand the approach taken in [22] to show the
coincidence of the a priori semimeasures and the universal mixtures. Let us first derive
the fact that a universal transformation of $\mu$ is an a priori semimeasure.

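Proposition 2.6. Let $\mu$ be a continuous computable measure and universal
machine $U$ compatible with $\mu$. Then $\mu_U \in \mathcal{A}$.

The proof rests on a fixed-point lemma that is a refined version of Corollary
1.5. For given encoding $\{\rho_e\}_e$, define $\mu^e(\cdot) := \mu(\cdot \mid \llbracket\rho_e\rrbracket)$ for any $e \in \mathbb{N}$. Here the
conditional measure $\mu(\llbracket\tau\rrbracket \mid \llbracket\sigma\rrbracket) := \mu(\llbracket\sigma\tau\rrbracket) / \mu(\llbracket\sigma\rrbracket)$ for any $\sigma, \tau \in \mathbb{B}^*$.

Lemma 2.7. Given encoding $\{\rho_e\}_{e \in \mathbb{N}}$ of the monotone machines as above. For
every continuous computable measure $\mu$,

\[ \{\mu^e_{M_e}\}_e = \mathcal{M}. \]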

Proof. Let $\nu$ be any semicomputable semimeasure. Since $\mu^e$ is obviously a
computable measure for every $e \in \mathbb{N}$, by the construction of Theorem 1.4 we obtain for
every $e$ a monotone machine $M$ with $\nu = \mu^e_M$. Indeed, there is a total computable
function $g : \mathbb{N} \to \mathbb{N}$ that for given $e$ retrieves an index $g(e)$ in the given enumeration
$\{M_e\}_{e \in \mathbb{N}}$ such that $\nu = \mu^e_{M_{g(e)}}$. But by the Recursion Theorem, there must be a
fixed point $\hat e$ such that $M_{g(\hat e)} = M_{\hat e}$, hence $\mu^{\hat e}_{M_{\hat e}} = \mu^{\hat e}_{M_{g(\hat e)}} = \nu$.

This shows that for every $\nu$ there is an index $e$ such that $\nu = \mu^e_{M_e}$. Conversely,
the function $\mu^e_{M_e}$ is a semicomputable semimeasure for every $e$. □


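Proof of Proposition 2.6. Given continuous computable $\mu$ and universal $U$
compatible with $\mu$. We write out

\begin{align*}
\mu_U(\sigma) &= \mu(\llbracket \{\rho : \exists \sigma' \succcurlyeq \sigma ((\rho, \sigma') \in U)\} \rrbracket) \\
&= \sum_e \mu(\llbracket \{\rho_e\rho : \exists \sigma' \succcurlyeq \sigma ((\rho, \sigma') \in M_e)\} \rrbracket) \\
&= \sum_e \mu(\llbracket\rho_e\rrbracket)\, \mu(\llbracket \{\rho : \exists \sigma' \succcurlyeq \sigma ((\rho, \sigma') \in M_e)\} \rrbracket \mid \llbracket\rho_e\rrbracket) \\
&= \sum_e \mu(\llbracket\rho_e\rrbracket)\, \mu^e_{M_e}(\sigma).
\end{align*}

Lemma 2.7 tells us that the $\mu^e_{M_e}$ range over all elements in $\mathcal{M}$. Moreover,
$W(e) := \mu(\llbracket\rho_e\rrbracket)$ is a weight function because $\{\rho_e\}_e$ is prefix-free and $U$ is compatible
with $\mu$, so $\mu_U$ is a universal mixture. □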
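
The decomposition used in this proof can be checked numerically on a toy instance. The Python sketch below is only an illustration under explicit assumptions: the machine `U` adjoins just two finite machines, so it is not actually universal; `sticky` is a hypothetical non-uniform continuous premeasure (first bit fair, each later bit repeats its predecessor with probability 2/3); and all helper names are made up for this example. It compares the direct transformation of the measure by `U` with the weighted sum of transformations of the conditional measures.

```python
from fractions import Fraction


def sticky(s: str) -> Fraction:
    """Hypothetical premeasure: fair first bit, later bits repeat the previous bit w.p. 2/3."""
    if not s:
        return Fraction(1)
    prob = Fraction(1, 2)
    for a, b in zip(s, s[1:]):
        prob *= Fraction(2, 3) if a == b else Fraction(1, 3)
    return prob


def conditional(m, prefix: str):
    """Premeasure of the measure conditioned on the cone of prefix (needs m(prefix) > 0)."""
    base = m(prefix)
    return lambda s: m(prefix + s) / base


def minimal(strings):
    S = set(strings)
    return {r for r in S if not any(q != r and r.startswith(q) for q in S)}


def transform(m, M, sigma: str) -> Fraction:
    """Measure of the inputs on which M outputs some extension of sigma."""
    descriptions = {rho for (rho, out) in M if out.startswith(sigma)}
    return sum((m(rho) for rho in minimal(descriptions)), Fraction(0))


def adjoin(machines, codes):
    """Machine that runs machines[e] on the rest of the input after reading codes[e]."""
    return {(code + rho, out) for code, M in zip(codes, machines) for (rho, out) in M}


def mixture(m, machines, codes, sigma: str) -> Fraction:
    """Sum over e of m(code_e) times the transformation of the conditional measure by M_e."""
    return sum((m(c) * transform(conditional(m, c), M, sigma)
                for c, M in zip(codes, machines)), Fraction(0))


machines = [{("0", "0"), ("1", "1")}, {("0", "11"), ("1", "1")}]
codes = ["0", "10"]  # prefix-free code words, both of positive sticky-measure
U = adjoin(machines, codes)
```

For every output string tried, `transform(sticky, U, sigma)` and `mixture(sticky, machines, codes, sigma)` agree, which is the finite counterpart of the chain of equalities above.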

We now proceed to prove that every universal transformation of $\mu$ indeed equals
some universal transformation of $\bar\mu$.

Proof of Theorem 2.5. Given continuous computable $\mu$ and $\bar\mu$, and universal $U$
compatible with $\mu$. Write out as before

\[ \mu_U(\sigma) = \sum_e \mu(\llbracket\rho_e\rrbracket)\, \mu^e_{M_e}(\sigma). \]

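Note that the function

\[ P(\sigma) = \begin{cases} \mu(\llbracket\sigma\rrbracket) & \text{if } \sigma = \rho_e \text{ for some } e \in \mathbb{N} \\ 0 & \text{otherwise} \end{cases} \]

is a semicomputable discrete semimeasure. Hence by Proposition 1.9 we can
construct a prefix-free machine $T$ that transforms $\bar\mu$ into $P$: so $Q^{\bar\mu}_T = P$. Denote $n_e :=
\#\{\tau : (\tau, \rho_e) \in T\}$ the number of $T$-descriptions of $\rho_e$, and let $\langle\cdot,\cdot\rangle : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ be
a partial computable pairing function that maps the pairs $(e, i)$ with $i < n_e$ onto
$\mathbb{N}$. Let $\bar\rho_{\langle e,i\rangle}$ be the $i$-th enumerated $T$-description of $\rho_e$. We then have

\begin{align*}
\sum_e \mu(\llbracket\rho_e\rrbracket)\, \mu^e_{M_e}(\sigma) &= \sum_e Q^{\bar\mu}_T(\rho_e)\, \mu^e_{M_e}(\sigma) \\
&= \sum_e \sum_{i < n_e} \bar\mu(\llbracket\bar\rho_{\langle e,i\rangle}\rrbracket)\, \mu^e_{M_e}(\sigma).
\end{align*}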

Write $\bar\mu^d$ for $\bar\mu(\cdot \mid \llbracket\bar\rho_d\rrbracket)$. Now for every $\langle e,i\rangle$ for which $\bar\rho_{\langle e,i\rangle}$ becomes defined we
can run the construction of Theorem 1.4 on $\bar\mu^{\langle e,i\rangle}$ and $\mu^e_{M_e}$. In this way we obtain
an enumeration of machines $\{\bar M_d\}_d$ such that $\bar\mu^{\langle e,i\rangle}_{\bar M_{\langle e,i\rangle}} = \mu^e_{M_e}$ (with $i < n_e$) for all
$e$. Then

\[ \sum_e \sum_{i < n_e} \bar\mu(\llbracket\bar\rho_{\langle e,i\rangle}\rrbracket)\, \mu^e_{M_e}(\sigma) = \sum_d \bar\mu(\llbracket\bar\rho_d\rrbracket)\, \bar\mu^d_{\bar M_d}(\sigma), \]

which we can rewrite to $\bar\mu_{\bar U}(\sigma)$, defining $\bar U$ by $(\bar\rho_d\rho, \sigma) \in \bar U :\iff (\rho, \sigma) \in \bar M_d$.

It remains to verify that $\bar U$ is in fact universal. Namely, we cannot take for
granted that $\{\bar M_d\}_{d \in \mathbb{N}}$ is an enumeration of all machines, hence it is not clear that
$\bar U$ is universal.² Note that it is enough if there were a single universal machine $U'$
in $\{\bar M_d\}_d$, but even that is not obvious (by Proposition 2.2 we know that for all
continuous $\mu$ there are for any universal $U$ non-universal $M$ such that $\mu_M = \mu_U$).
However, there is a simple patch to the enumeration that guarantees this fact.
Namely, given an arbitrary universal machine $V$, we may simply put $\bar M_d := V$
at some $d = \langle e,i\rangle$ where it so happens that $\bar\mu^{\langle e,i\rangle}_V = \mu^e_{M_e}$. (We cannot effectively
find this $d$, but it is finite information so if this $d$ exists then so does the patched
enumeration.)

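Our final objective is then to show that $\bar\mu^{\langle e,i\rangle}_V = \mu^e_{M_e}$ for some $e, i$. Define
computable $g : \mathbb{N} \to \mathbb{N}$ by $\mu^e_{M_{g(e)}} = \bar\mu^{\langle e,0\rangle}_V$. Since $Q^{\bar\mu}_T(\rho_e) > 0$ for each $e$, the
string $\bar\rho_{\langle e,0\rangle}$ is defined for each $e$. Hence $\bar\mu^{\langle e,0\rangle}_V$ is defined, and function $g$, that
retrieves the index $g(e)$ of a machine that transforms $\mu^e$ to this semimeasure, is
total. Then by the Recursion Theorem there is index $\hat e$ such that $M_{\hat e} = M_{g(\hat e)}$, so

\[ \mu^{\hat e}_{M_{\hat e}} = \mu^{\hat e}_{M_{g(\hat e)}} = \bar\mu^{\langle \hat e,0\rangle}_V. \]

□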

Corollary 2.8. For continuous computable $\mu$, and $U$ ranging over those universal
machines that are compatible with $\mu$,

\[ \{\mu_U\}_U = \mathcal{A}. \]

Discrete a priori semimeasures. A universal prefix-free machine $U$ is defined by

\[ (\rho_e\rho, \sigma) \in U :\iff (\rho, \sigma) \in T_e \]

for all $\rho, \sigma \in \mathbb{B}^*$ and some computable prefix-free and non-repeating enumeration
$\{\rho_e\}_{e \in \mathbb{N}} \subseteq \mathbb{B}^*$ that serves as an encoding of some computable enumeration $\{T_e\}_{e \in \mathbb{N}}$
of all prefix-free machines.

Definition 2.9. A discrete a priori semimeasure is defined by

\[ Q_U(\sigma) := \lambda(\llbracket \{\rho : (\rho, \sigma) \in U\} \rrbracket) \]
for a universal prefix-free machine $U$.

Let $\mathcal{Q}$ denote the class of all discrete a priori semimeasures. Discrete versions
of the above results are derived in an identical manner. Ultimately, we have the
following discrete analogue to Corollary 2.8.

Proposition 2.10. For continuous computable $\mu$, and $U$ ranging over those universal
prefix-free machines that are compatible with $\mu$,

\[ \{Q^\mu_U\}_U = \mathcal{Q}. \]

Discussion. We now return to the association of the function $\lambda_U$ (as well as its
discrete counterpart $Q_U$) with foundational principles.

First, there is the association with the principle of insufficient reason or
indifference. This is the principle that in the absence of discriminating evidence,
probability should be equally distributed over all possibilities. Solomonoff writes,
“If we consider the input sequence to be the ‘cause’ of the observed output sequence,
and we consider all input sequences of a given length to be equiprobable (since we
have no a priori reason to prefer one rather than the other) then we obtain the
present model of induction.” [20, p. 19]. Also see [12, 16].

² This is also an (overlooked) issue in the original proof in [22, Lemma 4]. It is easily resolved
by the same approach we take below, where it is immediate that for universal $V$ there is $e$ with
$\lambda_V = \nu_e$.

Second, there is the association with Occam’s razor. Solomonoff writes, “That 
[this model] might be valid is suggested by ‘Occam’s razor,’ one interpretation of 
which is that the more ‘simple’ or ‘economical’ of several hypotheses is the more 
likely ... —the most ‘simple’ hypothesis being that with the shortest ‘description.’” 
[20, p. 3]. Also see [21, 13, 9, 2, 15]. 

Note that so stated, these associations very much rely on the fact that the
uniform measure $\lambda$ always assigns larger probability to shorter strings, and equal
probability to equal-length strings. This is a unique feature of $\lambda$. The results of this
paper, however, imply that the choice of the uniform measure in defining
algorithmic probability is only circumstantial: we could pick any continuous computable
measure, and still obtain, as the universal transformations of this measure instead
of $\lambda$, the very same class of a priori semimeasures. This suggests that properties
derived from the presence of $\lambda$ in the definition are artifacts of a particular choice of
characterization rather than an indicative property of algorithmic probability, and
hence undermines both associations insofar as they indeed hinge on the uniform
measure.